To address the difficulty of identifying user plagiarism in social networks, protect the rights of original authors, and hold users accountable for plagiarism, a blockchain-based plagiarism identification scheme for social network users was proposed. Because existing blockchains lack a universal tracing model, a blockchain-based traceability information management model was designed to record user operation information and provide a basis for text similarity detection. Based on the Merkle tree and Bloom filter structures, a new index structure, BHMerkle, was designed, which reduced the computational overhead of block construction and query and enabled rapid positioning of transactions. In addition, a multi-feature weighted Simhash algorithm was proposed to improve the precision of word weight calculation and the efficiency of the signature value matching stage. In this way, malicious users who plagiarize could be identified, and malicious behavior could be curbed through a reward and punishment mechanism. The average precision and recall of the plagiarism detection scheme on news datasets with different topics were 94.8% and 88.3%, respectively. Compared with the multi-dimensional Simhash algorithm and the Simhash algorithm based on information Entropy weighting (E-Simhash), the average precision was increased by 6.19 and 4.01 percentage points respectively, and the average recall was increased by 3.12 and 2.92 percentage points respectively. Experimental results show that the proposed scheme improves the query and detection efficiency for plagiarized text and identifies plagiarism with high accuracy.
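For readers unfamiliar with Simhash, the following is a minimal sketch of a generic weighted Simhash signature and the Hamming-distance comparison used for matching. It is not the multi-feature weighting proposed in the abstract: the token weights are simply passed in by the caller, and the choice of MD5 as the per-token hash and the 64-bit signature length are assumptions made for illustration.

```python
import hashlib


def simhash(weighted_tokens, bits=64):
    """Generic weighted Simhash (illustrative; weights are supplied by the caller).

    weighted_tokens: iterable of (token, weight) pairs.
    bits: signature length, assumed to be a multiple of 8 and at most 128.
    """
    acc = [0.0] * bits
    for token, weight in weighted_tokens:
        # Assumption: MD5 is used only as a convenient, stable token hash.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each token votes +weight or -weight on every bit position.
            acc[i] += weight if (h >> i) & 1 else -weight
    signature = 0
    for i in range(bits):
        if acc[i] > 0:
            signature |= 1 << i
    return signature


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")
```

Two texts would typically be treated as near-duplicates when the Hamming distance between their signatures falls below a chosen threshold; the threshold itself is application-specific and not given in the abstract.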
The complexity of pedestrian interaction is a challenge for pedestrian trajectory prediction. Existing algorithms struggle to capture meaningful interaction information between pedestrians and cannot model that interaction intuitively. To address this problem, a multi-head soft attention graph convolutional network was proposed. Firstly, Multi-head Soft ATTention (MS ATT) combined with an involution network was used to extract a sparse spatial adjacency matrix and a sparse temporal adjacency matrix from the spatial and temporal graph inputs respectively, generating a sparse spatial directed graph and a sparse temporal directed graph. Then, a Graph Convolutional Network (GCN) was used to learn interaction and motion trend features from the sparse spatial and sparse temporal directed graphs. Finally, the learned trajectory features were input into a Temporal Convolutional Network (TCN) to predict double Gaussian distribution parameters, thereby generating the predicted pedestrian trajectories. Experiments on the Eidgenossische Technische Hochschule (ETH) and University of CYprus (UCY) datasets show that the proposed algorithm reduces the Average Displacement Error (ADE) by 2.78% compared with the Space-time sOcial relationship pooling pedestrian trajectory Prediction Model (SOPM), and reduces the Final Displacement Error (FDE) by 16.92% compared with the Sparse Graph Convolution Network (SGCN).
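The two evaluation metrics named above have standard definitions, sketched here for a single pedestrian: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final step. The function name and the list-of-tuples input format are assumptions for illustration; the abstract does not describe the evaluation code.

```python
import math


def ade_fde(pred, truth):
    """Return (ADE, FDE) for one trajectory.

    pred, truth: equal-length lists of (x, y) positions, one per time step.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    ade = sum(dists) / len(dists)  # average over all time steps
    fde = dists[-1]                # error at the final time step only
    return ade, fde
```

In benchmark reporting these values are usually averaged again over all pedestrians and scenes.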
Sparse-dense Matrix Multiplication (SpMM) is widely used in fields such as scientific computing and deep learning, so improving its efficiency is of great importance. For a class of sparse matrices with band features, a new storage format BRCV (Banded Row Column Value), an SpMM algorithm based on this format, and an efficient Graphics Processing Unit (GPU) implementation were proposed. Because each sparse band can contain multiple sparse blocks, the proposed format can be seen as a generalization of the block sparse matrix format. Compared with the commonly used CSR (Compressed Sparse Row) format, the BRCV format significantly reduced the storage complexity by avoiding redundant storage of column indices in sparse bands. At the same time, the GPU implementation of SpMM based on the BRCV format made more efficient use of the GPU’s shared memory and improved the computational efficiency of the SpMM algorithm by reusing the rows of both the sparse and dense matrices. For randomly generated band sparse matrices, experimental results on two different GPU platforms show that BRCV outperforms not only cuBLAS (CUDA Basic Linear Algebra Subroutines), but also cuSPARSE based on the CSR and block sparse formats. Specifically, compared with cuSPARSE based on the CSR format, BRCV achieves maximum speedup ratios of 6.20 and 4.77 on the two platforms respectively. Moreover, the new implementation was applied to accelerate the SpMM operator in Graph Neural Network (GNN). Experimental results on real application datasets show that BRCV outperforms cuBLAS and cuSPARSE based on the CSR format, and also outperforms cuSPARSE based on the block sparse format in most cases. Specifically, compared with cuSPARSE based on the CSR format, BRCV achieves a maximum speedup ratio of 4.47. These results indicate that BRCV can effectively improve the efficiency of SpMM.
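As a point of reference for the comparison above, here is a minimal CPU sketch of SpMM over the baseline CSR format. It is not the BRCV format or its GPU kernel, whose layout the abstract does not specify. It does show that CSR stores one column index per nonzero (`indices`), which is the per-element storage that a band-oriented format can avoid inside a band. The function name and argument names are illustrative.

```python
def csr_spmm(indptr, indices, data, dense):
    """Multiply a CSR sparse matrix by a dense matrix (list of rows).

    indptr:  row pointer array, length n_rows + 1.
    indices: column index of each stored nonzero.
    data:    value of each stored nonzero.
    dense:   dense right-hand matrix as a list of equal-length rows.
    """
    n_rows = len(indptr) - 1
    n_cols = len(dense[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Nonzeros of sparse row i live in data[indptr[i]:indptr[i + 1]].
        for k in range(indptr[i], indptr[i + 1]):
            a = data[k]
            dense_row = dense[indices[k]]  # one dense row per nonzero
            for j in range(n_cols):
                out[i][j] += a * dense_row[j]
    return out
```

Each nonzero triggers a read of a full dense row, so nonzeros that share a column within a band offer an opportunity to reuse that row, which is consistent with the row-reuse idea described in the abstract.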
Existing similarity-based moving target trajectory prediction algorithms are generally classified according to the spatial-temporal characteristics of the data, which does not reflect the characteristics of the algorithms themselves. Therefore, a classification method based on algorithm characteristics was proposed. Trajectory similarity algorithms need the distance between two points for their subsequent calculations; however, the commonly used Euclidean Distance (ED) is only applicable to targets moving in a small region. For the trajectory prediction of sea targets moving in a large region, a similarity calculation method using geodetic distance instead of ED was proposed. Firstly, the trajectory data were preprocessed and segmented. Then, the discrete Fréchet Distance (FD) was adopted as the similarity measure. Finally, synthetic and real data were used for testing. Experimental results indicate that when sea targets move in a large region, the ED-based algorithm may produce incorrect prediction results, while the geodetic distance-based algorithm outputs correct trajectory predictions.
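The idea of swapping the point-to-point distance inside the discrete Fréchet distance can be sketched as follows. The abstract does not say which geodetic formula was used; this sketch assumes the haversine great-circle formula on a spherical Earth with a mean radius of 6371 km, which is an approximation rather than a true ellipsoidal geodesic. The function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def haversine_km(p, q):
    """Great-circle distance in km between (lat, lon) points given in degrees."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def discrete_frechet(P, Q, dist=haversine_km):
    """Discrete Fréchet distance between point sequences P and Q.

    dist is the pluggable point-to-point metric; passing a Euclidean
    function instead reproduces the ED-based variant.
    """
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]
```

Over a large region, one degree of longitude spans very different ground distances at different latitudes, which is why treating raw coordinates as planar can rank candidate trajectories incorrectly.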
Attackers can illegally open a vehicle by forging the Radio Frequency IDentification (RFID) signal sent by the vehicle remote key. Moreover, when the vehicle remote key is lost or stolen, an attacker can obtain the secret data inside it and clone a usable key, which threatens the property and privacy of the vehicle owner. To address these problems, a Vehicle RKE Two-Factor Authentication (VRTFA) protocol for vehicle Remote Keyless Entry (RKE) that resists physical cloning attacks was proposed. The protocol is based on a Physical Unclonable Function (PUF) and biological fingerprint feature extraction and recovery functions, so that the specific hardware physical structure of the legitimate vehicle remote key cannot be forged. At the same time, the biological fingerprint factor was introduced to build a two-factor authentication protocol, thereby eliminating the security risk of vehicle remote key theft and further guaranteeing secure mutual authentication in the vehicle RKE system. Security analysis of the protocol using BAN logic shows that the VRTFA protocol can resist malicious attacks such as forgery, desynchronization, replay, man-in-the-middle, physical cloning, and full key leakage attacks, and satisfies security attributes such as forward security, mutual authentication, data integrity, and untraceability. Performance analysis shows that the VRTFA protocol has stronger security and privacy and better practicality than existing RFID authentication protocols.
Ring signature is widely used to address the disclosure of user identity and data privacy because of its spontaneity and anonymity. A certificateless public key cryptosystem not only solves the key escrow problem but also does not need public key certificate management. Certificateless ring signature combines the advantages of both and has extensive research significance, but most existing certificateless ring signature schemes are based on bilinear pairings and modular exponentiation, which are computationally expensive and inefficient. In order to improve the efficiency of the signature and verification stages, a new Efficient CertificateLess Ring Signature (ECL-RS) scheme was proposed, which used elliptic curves for their low computational cost, high security and good flexibility. The security of the ECL-RS scheme is based on a discrete logarithm problem and a Diffie-Hellman problem, and the scheme is proved to be unforgeable and anonymous under the Random Oracle Model (ROM), resisting public key substitution attacks and malicious key generation center attacks. Performance analysis shows that the ECL-RS scheme only needs (n+2) elliptic curve scalar multiplication and scalar addition operations as well as (n+3) one-way hash operations, where n is the number of ring members, giving it lower computational cost and higher efficiency while ensuring security.
Cross-project software defect prediction can solve the problem of scarce training data in the project to be predicted. However, the source project and the target project usually have a large distribution difference, which reduces the prediction performance. In order to solve this problem, a new Cross-Project Defect Prediction method based on Feature Selection and TrAdaBoost (CPDP-FSTr) was proposed. Firstly, in the feature selection stage, Kernel Principal Component Analysis (KPCA) was used to delete redundant data in the source project. Then, based on the attribute feature distributions of the source project and the target project, the candidate source project data closest in distance to the target project distribution were selected. Finally, in the instance transfer stage, the TrAdaBoost method improved by an evaluation factor was used to find the instances in the source project whose distribution was similar to that of a few labeled instances in the target project, and a defect prediction model was established. With F1 as the evaluation index, compared with cross-project software defect prediction using Feature Clustering and TrAdaBoost (FeCTrA) and Cross-project software defect prediction based on Multiple Kernel Ensemble Learning (CMKEL), the proposed CPDP-FSTr improved the prediction performance by 5.84% and 105.42% respectively on the AEEEM dataset and by 5.25% and 85.97% respectively on the NASA dataset, and its two-process feature selection was better than a single feature selection process. Experimental results show that the proposed CPDP-FSTr can achieve better prediction performance when the source project feature selection proportion and the target project labeled instance proportion are 60% and 20% respectively.
Most existing network embedding methods preserve only the local structure information of the network and ignore other potential information in the network. In order to preserve the community information of the network and reflect the multi-granularity characteristics of the network community structure, a network Embedding method based on Multi-Granularity Community information (EMGC) was proposed. Firstly, the network’s multi-granularity community structure was obtained, and the node embedding and the community embedding were initialized. Then, according to the node embedding at the previous level of granularity and the community structure at the current level of granularity, the community embedding was updated, and the corresponding node embedding was adjusted. Finally, the node embeddings under different community granularities were concatenated to obtain a network embedding that fused the community information of different granularities. Experiments were carried out on four real network datasets. Compared with methods that do not consider community information (DeepWalk, node2vec) and methods that consider single-granularity community information (ComE, GEMSEC), EMGC’s AUC value on link prediction and F1 score on node classification are generally better. The experimental results show that EMGC can effectively improve the accuracy of subsequent link prediction and node classification.
To address the model information leakage caused by interpretability in Deep Neural Network (DNN), the feasibility of using the Gradient-weighted Class Activation Mapping (Grad-CAM) interpretation method to generate adversarial samples in a white-box environment was demonstrated, and an untargeted black-box attack algorithm named dynamic genetic algorithm was proposed. In the algorithm, the fitness function was first improved according to the changing relationship between the interpretation area and the positions of the perturbed pixels. Then, through multiple rounds of the genetic algorithm, the perturbation value was continuously reduced while the number of perturbed pixels was increased, and the set of result coordinates of each round was retained and used in the next round of iteration until the perturbed pixel set caused the predicted label to be flipped without exceeding the perturbation boundary. In the experiments, the average attack success rate of the proposed algorithm under the AlexNet, VGG-19, ResNet-50 and SqueezeNet models was 92.88%, which was 16.53 percentage points higher than that of the One pixel algorithm, although the running time was 8% longer than that of the One pixel algorithm. In addition, with a shorter running time, the proposed algorithm had a success rate 3.18 percentage points higher than the Adaptive Fast Gradient Sign Method (Ada-FGSM) algorithm, 8.63 percentage points higher than the Projection & Probability-driven Black-box Attack (PPBA) algorithm, and not much different from the Boundary-attack algorithm. The results show that the dynamic genetic algorithm based on the interpretation method can effectively execute the adversarial attack.
Existing privacy protection k-means clustering schemes have low iterative efficiency, the server in the centralized differential privacy preserving k-means clustering scheme may be attacked, and the server in the localized differential privacy protection k-means clustering scheme may return wrong clustering results. To solve these problems, a Multi-party Privacy Protection k-means Clustering Scheme based on Blockchain (M-PPkCS/B) was proposed. Taking advantage of localized differential privacy technology and the characteristics of the blockchain such as being open, transparent, and non-tamperable, a Multi-party k-means Clustering Center Initialization Algorithm (M-kCCIA) was first designed to improve the iterative efficiency of clustering while protecting user privacy, and to ensure the correctness of the initial clustering centers jointly generated by the users. Then, a Blockchain-based Privacy Protection k-means Clustering Algorithm (Bc-PPkCA) was designed, and a smart contract for the clustering center updating algorithm was constructed. The clustering center was updated iteratively by this smart contract on the blockchain to ensure that each user was able to obtain the correct clustering results. Experimental results on the datasets HTRU2 and Abalone show that, while ensuring that each user obtains the correct clustering results, the accuracy can reach 97.53% and 96.19% respectively, and the average iteration times of M-kCCIA is 5.68 times and 2.75 times less than that of the algorithm of randomly generating initial cluster centers called Random Selection (RS).
To address the limited scalability of medical data sharing based on traditional blockchains, a blockchain scale-out and sharing scheme based on sharding technology was proposed. Firstly, periodic network sharding was performed based on the jump consistent hash algorithm, and the risk of Sybil attacks in a single shard was greatly reduced by randomly dividing the network nodes. Then, the Scalable decentralized Trust inFrastructure for Blockchains (SBFT) consensus protocol was used within the shards to reduce the high communication complexity of the Practical Byzantine Fault Tolerance (PBFT) consensus protocol, and a two-layer architecture was used between the physical multi-chain of shards and the logical single chain of the main chain to reduce the storage pressure on shard members. Finally, a multi-keyword association retrieval searchable encryption sharing scheme based on Public key Encryption with Conjunctive field Keyword Search (PECKS) was proposed on the medical consortium blockchain, so as to improve patients’ control over their sensitive data and realize fine-grained search of sensitive data under encryption. Performance analysis shows that under the parallel sharding structure, the throughput of the blockchain increases significantly as the number of shards increases, and the retrieval efficiency is also significantly improved. Experimental results show that the proposed scheme can greatly improve the efficiency and scalability of the blockchain system.
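The jump consistent hash algorithm mentioned above is a published algorithm (Lamping and Veach) that maps a 64-bit key to one of a given number of buckets. A Python port is sketched below. How the scheme derives keys from node identities is not stated in the abstract, so the `assign_shard` helper, which hashes a node ID together with an epoch number using SHA-256, is a hypothetical illustration of periodic re-sharding and not the authors' method.

```python
import hashlib

MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key, num_buckets):
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step used as a pseudo-random generator.
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def assign_shard(node_id, epoch, num_shards):
    """Hypothetical helper: derive a per-epoch key for a node, then pick a shard.

    Mixing the epoch into the key reshuffles assignments every period.
    """
    digest = hashlib.sha256(f"{node_id}:{epoch}".encode("utf-8")).digest()
    key = int.from_bytes(digest[:8], "big")
    return jump_consistent_hash(key, num_shards)
```

A useful property of jump consistent hash is that when the bucket count grows from n to n+1, a key either keeps its bucket or moves to the new one, so adding a shard relocates only a small fraction of nodes.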
Traditional file sharing schemes suffer from easy leakage of files, difficulty in controlling where files go, and complex access control, while cloud applications require hierarchical classification management and sharing of files. To meet these needs, a hierarchical file access control scheme with identity-based multi-conditional proxy re-encryption was proposed. Firstly, the permission level of the file was taken as the condition of ciphertext generation, and a trusted hierarchical management unit was introduced to determine and manage the user levels. Secondly, the re-encryption key for the user’s hierarchical access permission was generated, which solved the problem that the identity-based conditional proxy re-encryption scheme only restricts the re-encryption behavior of proxy servers and does not limit the user’s permission. Meanwhile, the burden on the client was reduced, as users only needed to perform encryption and decryption operations. Comparison and analysis of different schemes show that, compared with existing access control schemes, the proposed scheme has obvious advantages: it can update the user’s access permission without the direct participation of users, and it provides uploader anonymity.
Compressed sensing mainly consists of random projection and reconstruction. Because the iterative shrinkage algorithm converges slowly and the traditional 2-dimensional wavelet transform lacks directionality, random projection was implemented by using Permute Discrete Cosine Transform (PDCT), and gradient projection was used for reconstruction. With the computational complexity simplified, the transform coefficients in the dual-tree complex wavelet domain were refined iteratively. Finally, the reconstructed image was obtained by the inverse transform. In the experiments, the reconstruction results of DT CWT (Dual-Tree Complex Wavelet Transform) and the bi-orthogonal wavelet were compared under the same reconstruction algorithm, and the former is better than the latter in image detail and smoothness, with a Peak Signal-to-Noise Ratio (PSNR) higher by 1.5 dB. In the same sparse domain, gradient projection converges faster than the iterative shrinkage algorithm. In addition, with the same sparse domain and reconstruction settings, PDCT has a slightly higher PSNR than the structural random matrix.
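The PSNR figures quoted above follow the standard definition, PSNR = 10 log10(peak^2 / MSE), where MSE is the mean squared error between the original and reconstructed images. A minimal sketch is given here, assuming 8-bit grayscale images stored as nested lists with a peak value of 255; the function name and input format are illustrative, and the abstract does not state the bit depth used.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images.

    original, reconstructed: 2-D lists of pixel values.
    peak: maximum possible pixel value (assumption: 255 for 8-bit images).
    """
    flat_o = [v for row in original for v in row]
    flat_r = [v for row in reconstructed for v in row]
    if len(flat_o) != len(flat_r) or not flat_o:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_r)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Since PSNR is logarithmic, a gain of 1.5 dB corresponds to the MSE dropping by a factor of about 1.41 (10^0.15).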